-
- FILE menu:
-
- The FILE menu is divided into three sections: Single Word Functions,
- Phrase Functions, and Miscellaneous.
-
- The FILE menu has 12 available selections: Extract Single Words,
- Extract Capitalized Words, Build Single Word Index, Word
- Frequency, Spinoff Unique Words, Extract Phrases, Extract
- Personal Names, Build Phrase Index, View Index on Screen, Print
- Index to Printer, Save Defaults, and Go to DOS.
-
- ┌───────────────────────────────────────────────────────────────┐
- │ ┌──────────────────────────┐ │
- │ │ ──Single Word Functions──│ │
- │ │ Extract Single Words │ │
- │ │ Extract Capitalized Words│ │
- │ │ Build Single Word Index │ │
- │ │ Word Frequency │ │
- │ │ Spinoff Unique Words │ │
- │ │ ──Phrase Word Functions──│ │
- │ │ Extract Phrases │ │
- │ │ Extract Personal Names │ │
- │ │ Build Phrase Index │ │
- │ │ ─────Miscellaneous───────│ │
- │ │ View Index on Screen │ │
- │ │ Print Index to Printer │ │
- │ │ Save Defaults │ │
- │ │ Go to DOS │ │
- │ └──────────────────────────┘ │
- │ │
- │ PC─INDEX 4.0─Index Generator Copyright 1989─91 Help Software │
- └───────────────────────────────────────────────────────────────┘
-
- The selections are grouped into three categories: Single Word
- Functions, Phrase Functions, and Miscellaneous Functions.
-
- Extract Single Words
-
- Extract Single Words is the first item in the menu. It is also
- the first step performed in creating a single word index. Its
- function is to extract each individual word from a document and
- record it.
-
- This option will extract all words in a document, one at a time,
- and record them in sorted order along with the page number that
- they occur on.
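-
- As a rough illustration of what this step produces, the following
- Python sketch pairs each word of a plain ASCII text with a page
- number and sorts the result. The function name extract_words, the
- fixed lines_per_page value, and the rule for what counts as a word
- are assumptions made for this example; they are not taken from
- PC─INDEX itself.
-
```python
import re

def extract_words(text, lines_per_page=66):
    """Return (word, page) pairs sorted by word, then by page.

    Assumption: a page is simply a fixed number of text lines, and a
    word is any run of letters or apostrophes.
    """
    entries = []
    for line_no, line in enumerate(text.splitlines()):
        page = line_no // lines_per_page + 1  # pages are numbered from 1
        for word in re.findall(r"[A-Za-z']+", line):
            entries.append((word, page))
    # Sort alphabetically (ignoring case), then by page number.
    return sorted(entries, key=lambda e: (e[0].lower(), e[1]))
```
-
- Called with a two-line text and lines_per_page=1, the sketch places
- the words of the first line on page 1 and those of the second line
- on page 2.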
-
- Before you begin with the Extract Words selection, you need to
- select the proper document type from the DOCUMENT menu.
-
- Select the Extract Single Words option from the FILE menu. You
- should now see a new window asking you for an input filename, an
- output filename, the page size, the page on which to start
- indexing, the first page number to use, and several other
- options.
-
- For the input filename, enter the name of the document that you
- want to index and press enter. For the output filename type any
- name you want and press enter. The output file is not the index,
- but a sorted list of all words in the document and the page
- numbers that they occur on. It is recommended that you use the
- same name as the document with '.srt' as the extension.
-
- The entry for page size is only used if you are using a Text or
- ASCII file. If you are using a word processor supported directly
- by PC─INDEX then you can ignore this entry. For a list of word
- processors supported by PC─INDEX, look in the Document menu.
-
- The next entry is Start Indexing on Page. This entry allows you
- to skip a few pages at the beginning of a document before the
- indexing starts. This will let you skip a title page, table of
- contents, or anything else at the beginning of a document that
- you don't want to index.
-
- The First Page Number to use setting will determine what page
- number PC─INDEX will use as the first page number. This entry
- can be used with the Start Indexing on Page setting so that you
- can start indexing on page four, but the first page number will
- be page one.
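-
- The interaction between these two settings can be sketched in a few
- lines of Python. This is only one plausible reading of the
- description above: it assumes that the first indexed page receives
- the First Page Number to use and that numbering continues from
- there. The function name printed_page and its parameters are
- hypothetical.
-
```python
def printed_page(physical_page, start_indexing_on=1, first_page_number=1):
    """Map a physical page of the file to the page number used in the index.

    Returns None for pages that come before the Start Indexing on Page
    setting, because those pages are skipped.
    """
    if physical_page < start_indexing_on:
        return None
    return first_page_number + (physical_page - start_indexing_on)
```
-
- With start_indexing_on=4 and first_page_number=1, physical page 4
- is numbered 1 in the index and physical page 6 is numbered 3.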
-
- The rest of the selections fall into two groups: which word list
- to use and what type of conversion to perform. One selection can
- be made from the choices in each of the two groups.
-
- The three choices on the left determine what words will be
- included in the index. Here are the options and the effect that
- they will have on an index.
-
- Don't Use any Word List: When this option is selected every word
- in the document will be included in the index. Common words like
- 'a', 'and', 'the', etc. will be indexed using this option.
-
- Use Include Word List: When the Use Include Word list option is
- selected, PC─INDEX will compare the extracted word to the include
- word list. If a match is found, the extracted word will be
- included in the extracted word list and the index.
-
- Use Discard Word List: When the Use Discard Word List option is
- selected, PC─INDEX will compare the extracted word to the discard
- word list. If a match is found, the extracted word will be
- discarded and will not be included in the extracted word list or
- the index.
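-
- The three word list choices amount to a simple filter, sketched
- below in Python. The function name keep_word, the mode strings, and
- the case-insensitive comparison are assumptions made for this
- example; the manual does not say how PC─INDEX compares words.
-
```python
def keep_word(word, mode="none", word_list=frozenset()):
    """Decide whether an extracted word stays in the extracted word list.

    mode is "none" (keep everything), "include" (keep only listed
    words), or "discard" (drop listed words).
    """
    key = word.lower()  # assumption: matching ignores case
    if mode == "include":
        return key in word_list
    if mode == "discard":
        return key not in word_list
    return True
```
-
- For example, with a discard list containing 'a', 'and', and 'the',
- the word 'The' is dropped and the word 'index' is kept.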
-
- For consistency, PC─INDEX can convert all words to the same case
- as they are being extracted. If you want to do any conversion,
- you have three choices. Convert words to UPPER CASE converts all
- words to upper case, Convert words to lower case converts all
- words to lower case, and Convert words to UPPER & lower case
- converts the first letter of each word to upper case and the
- rest of the word to lower case. If you select No Conversion
- then no conversion will take place.
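-
- The four conversion choices can be illustrated with a short Python
- sketch. The function name convert_case and the mode strings are
- hypothetical labels chosen for this example.
-
```python
def convert_case(word, mode="none"):
    """Apply one of the four case conversion choices to a word."""
    if mode == "upper":
        return word.upper()
    if mode == "lower":
        return word.lower()
    if mode == "upper_lower":
        # First letter upper case, the rest of the word lower case.
        return word[:1].upper() + word[1:].lower()
    return word  # "none": leave the word untouched
```
-
- Converting every word the same way means that 'Index', 'INDEX', and
- 'index' all become one spelling before they are recorded.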
-
- The completed window should look like this:
-
- ┌───────────────────────────────────────────────────────────────┐
- │ Input File Name: (Name of Document to process) │
- │ pci.doc │
- │ │
- │ Output File Name: │
- │ pci.srt │
- │ │
- │ Page Size Start Indexing on Page First Page Number to use │
- │ 60 5 1 │
- │ │
- │ Don't Use Any Word List X Perform No Conversion on Word │
- │ │
- │ Use Include Word List Convert Word to UPPER Case │
- │ │
- │X Use Discard Word List Convert Word to lower Case │
- │ │
- │ Convert Word to UPPER/lower │
- │ │
- └───────────────────────────────────────────────────────────────┘
-
- When you have finished entering the filenames and other
- information, press F10 to begin processing.
-
- Extract Capitalized Words
-
- The Extract Capitalized Words selection works in exactly the same
- manner as Extract Single Words, except that it extracts only
- capitalized words (such as names).
-
- Build Single Word Index
-
- Build Single Word Index is the final step in creating a single
- word index. It takes the file created by the 'Extract Single
- Words' selection and edited by the 'Edit Extracted Word File'
- selection and creates an index.
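-
- Conceptually, building the index means collapsing the sorted list
- of word and page pairs into one entry per word. The Python sketch
- below shows the idea; the function name build_index and the layout
- of each output line are assumptions for illustration and do not
- reproduce the actual PC─INDEX file formats.
-
```python
from itertools import groupby

def build_index(entries):
    """Turn (word, page) pairs into one index line per word."""
    lines = []
    # groupby needs its input sorted by the grouping key (the word).
    for word, group in groupby(sorted(entries), key=lambda e: e[0]):
        pages = sorted({page for _, page in group})  # drop repeated pages
        lines.append(f"{word}  {', '.join(map(str, pages))}")
    return lines
```
-
- A word that occurs several times on the same page is listed with
- that page number only once.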
-
- Select 'Build Single Word Index' from the FILE menu. You will be
- asked for the input file and output file. Enter the name of the
- extracted word file that you created with the Extract Words
- process. This file should have '.SRT' as the filename extension.
-
- Next you will be asked what name you want to use for the output
- file. This is the filename of the index. It is recommended
- that you use the original document name with the extension
- '.NDX'.
-
- The Wildcard Description file is only used if you are processing
- a group of files together. If you indexed a group of files then
- use the same wildcard description filename here. It contains
- information that PC─INDEX needs to complete the index.
-
- Next, PC─INDEX wants to know the page length (how many lines per
- page) you want to use. The default setting is 66 which is the
- proper setting for letter size paper. If you are using legal
- size paper, the proper setting would be 88. This number does not
- need to match the lines per page setting you used when you
- selected 'Extract Words'. Most laser printers will only output
- 60 lines per page. If you will be printing the index on a laser
- printer, you will probably want to set this option to 60.
-
- The next item to fill in is the page width. Here you will enter
- the total number of characters that will fit on one line of your
- printer. The maximum width accepted by PC─INDEX is 132
- characters. The number next to page width in reverse video is
- the calculated width required for the settings you have selected.
- This number (required width) must be smaller than the Page Width
- setting or an error will occur.
-
- Next, PC─INDEX asks you the number of columns you would like the
- output to be in. You will be able to produce an index up to four
- columns wide. An example of a two column index is included at
- the end of this document.
-
- The column width is the next entry. This entry controls the
- width of each column in the index. The minimum allowable width
- is 30 characters and the maximum is 99.
-
- The number of spaces between columns can range from 1 to 9
- characters.
-
- Next fill in the top, bottom, left, and right margins to the
- settings that you wish.
-
- The completed input window should look like this:
-
- ┌───────────────────────────────────────────────────────────────┐
- │ Input File Name: │
- │ pci.srt │
- │ │
- │ Output File Name: │
- │ pci.ndx │
- │ │
- │ Wildcard Description File Name: (Leave Blank if not needed) │
- │ │
- │ │
- │ Page Size Page Width (Columns) Number of Columns │
- │ 66 80 78 2 │
- │ Column Width Space Between Columns Top Margin │
- │ 30 3 5 │
- │ Bottom Margin Left Margin Right Margin │
- │ 5 10 5 │
- └───────────────────────────────────────────────────────────────┘
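-
- The required width of 78 shown in the sample window is consistent
- with adding up the columns, the gaps between them, and the left and
- right margins. The Python sketch below uses that formula; it is an
- assumption inferred from the sample values, not a documented
- PC─INDEX calculation, and the function names are hypothetical.
-
```python
def required_width(columns, column_width, space_between, left_margin, right_margin):
    """Estimate the characters needed on one printed line."""
    gaps = columns - 1  # spaces appear only between adjacent columns
    return columns * column_width + gaps * space_between + left_margin + right_margin

def settings_fit(page_width, needed):
    """The manual says the required width must be smaller than the page width."""
    return needed < page_width
```
-
- For the sample settings (2 columns of 30 characters, 3 spaces
- between columns, a left margin of 10 and a right margin of 5), the
- sketch gives 2 x 30 + 3 + 10 + 5 = 78, which fits a page width of
- 80.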
-
- When you have finished entering the filenames and other
- information, press F10 to begin processing.
-
- You should see a status box which tells you the number of words
- to be processed, the number of words actually processed, the
- letter of the alphabet currently being processed, percentage
- completed, and the elapsed time.
-
- When this is finished, you will be returned to the main menu and
- the completed index is contained in the text file under the name
- you entered. If you wish to view the file you can select View
- Index from the File Menu. If you want to print the index to a
- printer select Print Index from the File Menu. Since the index
- file is an ASCII file, you could also load it into almost any
- word processor and edit it further if you wish.
-
-
- Word Frequency List
-
- The Word Frequency List selection builds a word frequency list.
- This list contains all unique words found in a document in
- alphabetical order and the number of times that each word was
- used. This list is built from an extracted single word file. If
- you want a complete listing of all words, be sure to extract
- words using the 'Don't use any Word List' option.
-
- Enter the name of the extracted word file that you want to
- process for the Input File Name. If you have not already created
- an extracted single word file, then you will need to do this
- first.
-
- Enter any name you want for the output file name. This file will
- be an ASCII text file when finished. For consistency, it is
- recommended that you use the document name with the extension
- '.frq'.
-
- The minimum word count that you are asked for will allow you to
- set a minimum number of occurrences for a word to be included in
- the word frequency file. In other words, if you want only the
- most frequently used words in the word frequency list, you might
- enter 20 or some other large number in the Minimum Word Count
- entry. This way only words occurring 20 or more times would be
- included in the word frequency list.
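-
- The effect of the Minimum Word Count entry can be sketched in
- Python as follows. The function name word_frequencies is
- hypothetical, and the input is assumed to be a plain list of the
- extracted words rather than a real extracted word file.
-
```python
from collections import Counter

def word_frequencies(words, minimum_count=1):
    """Return (word, count) pairs in alphabetical order.

    Words that occur fewer than minimum_count times are left out.
    """
    counts = Counter(words)
    return sorted((w, n) for w, n in counts.items() if n >= minimum_count)
```
-
- With a minimum count of 2, a word that occurs only once in the
- extracted word list does not appear in the result.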
-
- Spinoff Unique Words
-
- The Spinoff Unique Words selection creates a file of phrases from
- an extracted single word file. This can be helpful when creating
- a customized list of phrases.
-
- This option will go through an extracted word file and write out
- all unique words to a phrase file. By editing the '.srt' file with
- the Edit Extracted Word File selection (found under the Edit Menu)
- you can mark or un─mark individual words. Then when you spin off
- a list you can spin off either the marked words or the un─marked
- words.
-
- First select Spinoff Unique Words from the File menu. Enter the
- Input File Name. It must be an extracted single word file. Next
- enter the Output File Name. This will be a phrase file and you
- should name it with a '.dbf' extension. Finally enter 'a' or 'i'
- to spin off either active or inactive words. Press F10 and
- processing will begin.
-
- You can change the default file names that PC─INDEX uses for
- phrase lists by using the Edit Word List Filenames selection under
- the Edit menu.
-
- Extract Phrases
-
- Extract Phrases will search through a document and find all
- occurrences of a list of phrases. It is the first step performed
- in creating a phrase index. Its function is to extract each
- individual phrase from a document and record it.
-
- Before you begin with the Extract Phrases selection, you need to
- select the proper document type from the Document menu.
-
- Select the Extract Phrases option from the FILE menu. You should
- now see a new window asking you for an input filename, an output
- filename, the page size, the page on which to start indexing,
- and the first page number to use.
-
- For the input filename, enter the name of the document that you
- want to index and press enter. You can press F2 here to select a
- file from a list. For the output filename type any name you
- want and press enter.
-
- The output file is not the index, but a sorted list of phrases in
- the document and the page numbers where they were found. It is
- recommended that you use the same name as the document with
- '.srt' as the extension.
-
- The entry for page size is only used if you are using a text or
- ASCII file. If you use a word processor supported directly by
- PC─INDEX then you can ignore this entry. For a list of word
- processors supported by PC─INDEX, look in the Document menu.
-
- The next entry is Start Indexing on Page. This entry allows you
- to skip a few pages at the beginning of a document before the
- indexing starts. This will let you skip a title page, table of
- contents, or anything else that you don't want to index.
-
- The First Page Number to use setting will determine what page
- number PC─INDEX will use as the first page number. This entry
- can be used with the Start Indexing on Page setting so that you
- can start indexing on page four, but the first page number will
- be page one. This will be useful if you want to skip a few pages
- at the beginning of a document.
-
- The completed window should look something like this:
-
- ┌───────────────────────────────────────────────────────────────┐
- │ Input File Name: (Name of Document to process) │
- │ pci.doc │
- │ │
- │ Output File Name: │
- │ pci.srt │
- │ │
- │ Page Size Start Indexing on Page First Page Number to use │
- │ 66 4 1 │
- └───────────────────────────────────────────────────────────────┘
-
- When you have finished entering the filenames and other
- information, press F10 to begin processing.
-
-
- Extract Personal Names
-
- This menu selection is new to this version of PC─INDEX. Extract
- Personal Names will go through a document, finding personal names
- (first and last names) and writing them out to a phrase file.
- This file can then be used to create a name index or merged with
- another phrase file to create a more comprehensive index that
- includes names.
-
- This selection is not guaranteed to find all names in a document,
- but it is a good starting point. It is more likely to extract
- capitalized words that are not really names than to omit real
- names.
-
- In order to use this option correctly, it will be helpful to
- understand what is happening. PC─INDEX scans a document until it
- finds at least two capitalized words in a row. If two
- capitalized words are found, then the first word is looked up in
- the Personal Name File. If the name is found then this sequence
- of capitalized words is assumed to be a personal name.
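-
- The scanning rule just described can be sketched in Python. This
- is a simplified illustration, not the PC─INDEX implementation: the
- function name find_names is hypothetical, first_names stands in
- for the Personal Name File, max_words corresponds to the Maximum
- Number of Words in a Name setting described later in this section,
- and punctuation and sentence boundaries are ignored.
-
```python
import re

def find_names(text, first_names, max_words=3):
    """Return runs of capitalized words whose first word is a known first name."""
    tokens = re.findall(r"[A-Za-z]+", text)
    names = []
    i = 0
    while i < len(tokens):
        # Collect a run of capitalized words, up to max_words long.
        j = i
        while j < len(tokens) and j - i < max_words and tokens[j][0].isupper():
            j += 1
        # At least two capitalized words, and the first is a known first name.
        if j - i >= 2 and tokens[i].lower() in first_names:
            names.append(" ".join(tokens[i:j]))
            i = j
        else:
            i += 1
    return names
```
-
- Because only the first word is looked up, a capitalized word that
- follows a name (such as the first word of a title) can be swept
- into the result, which is why the extracted names should be
- reviewed afterwards.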
-
- The Personal Name File contains over 12,000 first names. You may
- want to browse through the list using the Edit Personal Name File
- (found in the Edit List Menu) to make sure that it contains names
- you know you need.
-
- When you select Extract Personal Names, you will see a screen
- asking you for an Input File Name, an Output File Name, the
- Maximum Number of Words in a Name, and information regarding the
- surname (last name).
-
- For the input file name enter the name of the document you want
- to extract names from. For the output file name enter any name
- you want. It is recommended that you use a file name with the
- extension '.dbf'.
-
- The maximum number of words in a name can be any number from 2 to
- 6. There must be at least 2 words in a name (a first and last
- name) and no more than 6.
-
- The last three choices tell PC─INDEX how last names can be
- recognized. These choices were added to help PC─INDEX to find
- names faster and more accurately.
-
- The fastest and most accurate method for extracting names is Last
- Name is ALL CAPS. To use this option, every surname in the
- document must be written entirely in capital letters, and names
- that are not surnames must not be written entirely in capitals.
- If it isn't possible to use all caps in last names then use one
- of the other options. If it doesn't matter to you whether last
- names are all caps or not, then it is recommended that you use
- all caps. The increase in speed and accuracy will be
- significant.
-
- The next option, Last Name is not ALL CAPS, tells PC─INDEX that no
- names will contain only capital letters. This is the second
- fastest and second most accurate method for extracting names.
-
- The last option, Last Name may or may not be ALL CAPS, should be
- selected if the way capital letters are used in names is not
- consistent.
-
- The completed screen should look something like this:
-
-
- ┌──────────────────────────────────────────────────────┐
- │ Input File Name: (Name of Document to process) │
- │ pci.doc │
- │ │
- │ Output File Name: │
- │ pci.dbf │
- │ │
- │ Maximum Number of Words in a Name (2 ─ 6) │
- │ 3 │
- │ │
- │ X Last Name is ALL CAPS │
- │ │
- │ Last Name is not ALL CAPS │
- │ │
- │ Last Name may or may not be ALL CAPS │
- └──────────────────────────────────────────────────────┘
-
- When you have finished entering the filenames and other
- information, press F10 to begin processing.
-
- You should see a status box which tells you the number of words
- to be processed, the number of words actually processed, the
- number of names found, percentage completed, and the elapsed
- time.
-
- After this is complete you can (and probably should) browse
- through and edit the names that were just extracted by selecting
- Edit Extracted Name File from the Edit List Menu. This will
- allow you to correct names if necessary or to delete entries
- completely.
-
- You may want to merge the extracted name file with a phrase file
- so an index will contain both names and phrases. Since the
- extracted name file is actually a phrase file, you can use Merge
- Phrase Files (found in the Merge Files Menu) to accomplish this.
-
-
- Build Phrase Index
-
- Build Phrase Index is the final step in creating a phrase index.
- Build Phrase Index takes the file created by the 'Extract
- Phrases' selection and creates a phrase index.
-
- Select 'Build Phrase Index' from the FILE menu. You will be
- asked for the input file and output file. Enter the name of the
- extracted phrase file that you created with the Extract Phrases
- process. This file should have '.SRT' as the filename extension.
-
- Next you will be asked what name you want to use for the output
- file. This is the filename for the final index. It is
- recommended that you use the original document name with the
- extension '.NDX'.
-
- The Wildcard Description file is only used if you are processing
- a group of files together. If you indexed a group of files then
- use the same wildcard description filename here. It contains
- information that PC─INDEX needs to complete the index.
-
- Next, PC─INDEX wants to know the page length (how many lines per
- page) you want to use. The default setting is 66 which is the
- proper setting for letter size paper. If you are using legal
- size paper, the proper setting would be 88. This number does not
- need to match the lines per page setting you used when you
- selected 'Extract Phrases'. Many laser printers normally print 60
- lines per page. If you will be printing the index on a laser
- printer, you will probably want to set this option to 60.
-
- The next item to fill in is the page width. Here you will enter
- the total number of characters that will fit on one line of your
- printer. The maximum width accepted by PC─INDEX is 132
- characters. The number next to page width in reverse video is
- the calculated width required for the settings you have selected.
- This number (required width) must be smaller than the Page Width
- setting or an error will occur.
-
- Next, PC─INDEX asks you the number of columns you would like the
- output to be in. You will be able to produce an index up to four
- columns wide if your columns are small enough. An example of a
- two column phrase index is included at the end of this document.
-
- The column width is the next entry. This entry controls the
- width of each column in the index. The minimum allowable width
- is equal to the longest phrase in the phrase list that you used,
- and the maximum is 99.
-
- The number of spaces between columns can range from 1 to 9.
-
- Next fill in the top, bottom, left, and right margins to the
- settings that you wish.
-
- The completed input window should look something like this:
-
- ┌───────────────────────────────────────────────────────────────┐
- │ Input File Name: │
- │ pci.srt │
- │ │
- │ Output File Name: │
- │ pci.ndx │
- │ │
- │ Wildcard Description File Name: (Leave Blank if not needed) │
- │ │
- │ │
- │ Page Size Page Width (Columns) Number of Columns │
- │ 66 80 78 2 │
- │ Column Width Space Between Columns Top Margin │
- │ 30 3 5 │
- │ Bottom Margin Left Margin Right Margin │
- │ 5 10 5 │
- └───────────────────────────────────────────────────────────────┘
-
- When you have finished entering the filenames and other
- information, press F10 to begin processing.
-
- You should see a status box which tells you the number of words
- to be processed, the number of words actually processed, the
- letter of the alphabet currently being processed, percentage
- completed, and the elapsed time.
-
- When this is finished, you will be returned to the main menu and
- the completed index is contained in the text file that you named.
- If you wish to view the file you can select View Index from the
- File Menu and enter the name of the index that you just created.
- If you want to print the index, select Print Index from the File
- Menu. Since the index is an ASCII file, you could also load it
- into most word processors and edit it further if you wish.
-
- View Index on Screen
-
- View Index on Screen lets you see how the index you created
- looks. You will probably want to browse the index before you
- print it. You can use this selection to view any ASCII file.
-
-
- Print Index to Printer
-
- Print Index to Printer lets you print an index on your printer.
- If you have a problem printing, make sure that you have selected
- the correct printer port.
-
- You can change the printer port using the Edit Default Settings
- List in the Edit List Menu.
-
- Save Defaults
-
- Save Defaults saves the current settings in the DOCUMENT menu.
- It will also save all numeric settings and default word list
- filenames in the various dialogue boxes.
-
-
- Go to DOS
-
- Go to DOS allows you to perform DOS commands. Type EXIT to
- return to PC─INDEX when you are finished.
-
-
-
-
-